Finite State Machines as a Design Technique for SAS Programs; or, Turing the State of SAS
Abstract
The concepts behind Finite State Machines (FSMs) are introduced, and their operation is shown to be more compatible with SAS than that of traditional procedural programming when processing non-uniform input. A visual representation of FSMs as directed graphs is shown, and an example FSM is drawn. A tabular representation is shown, and its equivalence to the digraph demonstrated along with procedures for converting between the two representations. A simple, real-world problem is described and a procedural program design developed (illustrated by flow-charts and pseudocode). An FSM is designed for the same application, and implemented as a SAS program. This program is contrasted with the traditional procedural approach. The situations for which an FSM is a superior design technique are reviewed.

INTRODUCTION

DATA step processing in SAS is customarily performed by an implicit loop (called the DATA cycle) which iterates once for each input record. This is a helpful feature when processing uniform input, such as from a SAS data set or Relational Data Base (RDB) table. Much of the data stored in external files is uniform in structure and layout, but a significant portion is not.

Files with non-uniform structure can originate in situations where the nature of the data is hierarchical. An example of a hierarchical file structure would be end-of-day sales data which has a header record for the store location reporting the sales, a second-level header record for the sale itself (including the day and time of the sale, customer information, the method of payment, etc.), and detail records for each item sold (including item ID, quantity, unit price, etc.).

A file with non-uniform structure need not be hierarchical, however. Non-uniform, non-hierarchical files are created when events are recorded in real time.
Examples of this are logs created by manufacturing plants (chemical, pharmaceutical, automotive) recording vat temperatures, pH, valve opening/closing, inventory increase/decrease, etc. Another example of this type of file is the records kept by a computer operating system to track use of resources, either for capacity planning or user charge-back. When a program starts or ends, a file is opened or closed, or some other resource is affected, a record is written. While you might think of this as hierarchical (resources are used by jobs or online users), the order is determined by the time of day, so the records in the file are not structured in a hierarchical fashion. In addition, resources may be used in an asynchronous fashion, or used by the operating system itself or one of its independent tasks (daemons/services), so the structure of the file becomes even more muddled.

Files without uniform structure do not lend themselves to straightforward processing within the implicit loop of the SAS DATA cycle. Some coders cope with this by taking a traditional procedural-language approach and building a SAS program with multiple INPUT statements within loops or conditional statements. This approach makes the SAS program no different in structure from a COBOL, PL/I, C, or FORTRAN program. Such programs are alien to the SAS approach, and may be confusing to coders whose primary (or only) programming language is SAS.

Another method of processing non-uniform input data hinges on setting a set of flags to indicate what data has been processed or is yet to be processed. Such programs are usually put together in an ad-hoc fashion, and I have found them to be error-prone and brittle (likely to malfunction when there is a small change in the expected input). Such logic is usually awkward to adapt to new data, as the mental effort of tracking multiple, semi-independent flags becomes exponentially more difficult as the number of flags for each special case increases.
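To make the flag-based style concrete, here is a minimal sketch for the end-of-day sales file described earlier. The record-type codes (S, H, D), the column positions, and every variable name are assumptions made for illustration; they do not come from a real file layout.

```sas
/* Hypothetical layout (an assumption, not a real file format):   */
/*   column 1 = record type: S store header, H sale header,       */
/*   D item detail; the remaining fields start in column 3.       */
data items_flagged;
   length rectype $1 store_id $4 sale_id $6 item_id $8;
   retain have_store have_sale 0 store_id sale_id;
   input rectype $1. @;              /* hold the line for a second INPUT */
   if rectype = 'S' then do;
      input @3 store_id $4.;
      have_store = 1;
      have_sale  = 0;                /* a new store cancels the old sale */
   end;
   else if rectype = 'H' then do;
      input @3 sale_id $6.;
      have_sale = 1;
   end;
   else if rectype = 'D' and have_store and have_sale then do;
      input @3 item_id $8. qty price;
      output;                        /* one observation per item sold    */
   end;
   keep store_id sale_id item_id qty price;
   datalines;
S 0042
H 000101
D WIDGET01 2 9.99
D GADGET07 1 24.50
;
run;
```

Note that nothing in this step reports a detail record that arrives before its headers, or a record type it does not recognize; such records are skipped without any message.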
Programs built this way tend to be heavily dependent on assumptions about the structure of the data (e.g. record B must always follow record A, and record D will always be the last one of the group, etc.). Flaws in these assumptions, or small changes in the manner in which the input data is collected, can cause failures which go undetected, because the coder relied on these assumptions without including any checks to ensure that they remained true. Exception cases which occur in practice, but were unknown to the programmer at the time the code was created, can have disastrous effects. Due to the reliance on unverified assumptions, such programs may produce incorrect output without giving any indication that anything is amiss.

Finite State Machines (FSMs) provide a coding technique with both the flexibility to handle non-uniform input data and the rigor to report unexpected conditions. Creating a SAS program based on the FSM model is a process which helps uncover exceptional situations. An added benefit is that the FSM model is based on an implicit data cycle, and fits the processing flow of the SAS DATA step perfectly. The FSM model requires only a single control variable (known as the "state"), rather than a collection of multiple switches. Properties of FSMs have been studied by computer scientists for many years, and are well documented in the literature. Despite that, the simple form of FSMs used in this paper is readily understood and implemented by novice-to-intermediate SAS programmers.

I. THE EHS FILE

The case study is based on processing of a hierarchical file. This file is produced for a fictional interlibrary system
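As a minimal sketch of the single-state-variable technique described in the introduction (not the EHS program developed in this paper), the DATA step below reads the end-of-day sales layout mentioned earlier. The record-type codes (S, H, D), the column positions, the state names, and all variable names are assumptions made for illustration. Each pass through the implicit DATA cycle reads one record, looks up the pair (current state, record type), and either performs a transition or reports the record as unexpected.

```sas
/* Hypothetical layout (an assumption, not a real file format):   */
/*   column 1 = record type: S store header, H sale header,       */
/*   D item detail; the remaining fields start in column 3.       */
data items_fsm;
   length state $8 rectype $1 store_id $4 sale_id $6 item_id $8;
   retain state 'START' store_id sale_id;   /* one control variable */
   input rectype $1. @;                     /* hold line, inspect type */

   /* Each WHEN lists the (state/record type) pairs it accepts.    */
   select (catx('/', state, rectype));
      when ('START/S', 'IN_STORE/S', 'IN_SALE/S') do;
         input @3 store_id $4.;
         sale_id = ' ';
         state = 'IN_STORE';
      end;
      when ('IN_STORE/H', 'IN_SALE/H') do;
         input @3 sale_id $6.;
         state = 'IN_SALE';
      end;
      when ('IN_SALE/D') do;
         input @3 item_id $8. qty price;
         output;                            /* state stays IN_SALE  */
      end;
      otherwise
         /* Any pair not listed above is reported, not ignored.    */
         putlog 'WARNING: unexpected record: ' rectype= state= _n_=;
   end;
   keep store_id sale_id item_id qty price;
   datalines;
S 0042
H 000101
D WIDGET01 2 9.99
X not a known record type
D GADGET07 1 24.50
;
run;
```

The design choice worth noting is that every combination of state and record type not named in a WHEN clause falls through to OTHERWISE, so a change in the input structure produces a log message instead of silently wrong output.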
Publication date: 2008